332 research outputs found

    Understanding the Session Durability in Peer-to-Peer Storage System

    This paper argues that short-term session durability, rather than long-term availability and reliability, strongly affects the design of real large-scale Peer-to-Peer storage systems. We use a Markov chain to model session durability and derive its probability distribution. We then show how our analysis differs from the traditional Mean Time to Failure (MTTF) analysis, and conclude that misusing MTTF analysis can seriously mislead the understanding of session durability. We further show the impact of session durability analysis on real system design. To the best of our knowledge, this is the first discussion of the effects of session durability in large-scale Peer-to-Peer storage systems. Subject: Computer Science, Theory & Methods. Indexes: SCI(E), EI, CPCI-S (ISTP).

    Molecular Model of Dynamic Social Network Based on E-mail communication

    In this work we apply a physically inspired sociodynamical model to the evolution of an email-based social network. In contrast to the standard approach of sociodynamics, which expresses system dynamics through heuristically defined simple rules, we propose inferring these rules from real data and applying them within a dynamic molecular model. We show how to embed the n-dimensional social space in a Euclidean one. Then, inspired by the Lennard-Jones potential, we define a data-driven social potential function and apply the resultant force to a real e-mail communication network in the course of a molecular simulation, with network nodes taking on the role of interacting particles. We discuss all steps of the modelling process, from data preparation, through embedding and the molecular simulation itself, to the transformation from the embedding space back to a graph structure. The conclusions, drawn from examining the resultant networks in stable, minimum-energy states, emphasize the role of the embedding process that projects the non-metric social graph into Euclidean space, the significance of the unavoidable loss of information connected with this procedure, and the resultant preservation of global rather than local properties of the initial network. We also argue for the applicability of our method to some classes of problems, while signalling the areas that require further research in order to expand this applicability domain.
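For reference, the standard Lennard-Jones pair potential that the abstract names as its inspiration is written below. The symbols $\varepsilon$ (well depth), $\sigma$ (distance at which the potential is zero) and $r$ (distance between two particles) are the usual textbook notation, not taken from the paper; the paper's data-driven social potential is not given in the abstract and may differ in form.

```latex
V_{\mathrm{LJ}}(r) = 4\varepsilon \left[ \left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right],
\qquad
F(r) = -\frac{dV_{\mathrm{LJ}}}{dr}
     = \frac{24\varepsilon}{r} \left[ 2\left(\frac{\sigma}{r}\right)^{12} - \left(\frac{\sigma}{r}\right)^{6} \right]
```

The force is repulsive at short range and attractive at longer range, which is the qualitative behaviour a molecular simulation of interacting nodes would rely on.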

    Experimental evaluation of train and test split strategies in link prediction

    In link prediction, the goal is to predict which links will appear in the future of an evolving network. To estimate the performance of a supervised machine learning model on this task, disjoint and independent train and test sets are needed. However, objects in a real-world network are inherently related to each other. Therefore, it is far from trivial to separate candidate links into these disjoint sets. Here we characterize and empirically investigate the two dominant approaches from the literature for creating separate train and test sets in link prediction, referred to as random and temporal splits. Comparing the performance of these two approaches on several large temporal network datasets, we find evidence that random splits may give overly optimistic results, whereas a temporal split may give a fairer and more realistic indication of performance. Results appear robust to the selection of temporal intervals. These findings will be of interest to researchers who employ link prediction or other machine learning tasks in networks. Computer Systems, Imagery and Medi
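The difference between the two split strategies can be illustrated with a minimal sketch. It assumes edges are stored as `(u, v, timestamp)` tuples; the function names `random_split` and `temporal_split`, the `test_fraction` and `cutoff` parameters, and the toy edge list are all hypothetical and are not taken from the paper, whose actual procedure for building candidate links and features is more involved.

```python
import random


def random_split(edges, test_fraction=0.2, seed=0):
    """Hold out a random fraction of edges, ignoring their timestamps."""
    shuffled = list(edges)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    # First n_test shuffled edges form the test set, the rest the train set.
    return shuffled[n_test:], shuffled[:n_test]


def temporal_split(edges, cutoff):
    """Train on edges observed before `cutoff`, test on edges at or after it."""
    train = [e for e in edges if e[2] < cutoff]
    test = [e for e in edges if e[2] >= cutoff]
    return train, test


# Toy timestamped edge list (hypothetical data for illustration only).
edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 3), ("c", "d", 4), ("a", "d", 5)]

train_t, test_t = temporal_split(edges, cutoff=4)
train_r, test_r = random_split(edges, test_fraction=0.2, seed=1)
```

With the temporal split, every test edge is strictly later than every training edge, so the model never sees the future. With the random split, later edges can end up in the training set, which is one plausible reason such splits may look too optimistic.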

    Navigability is a Robust Property

    The Small World phenomenon has inspired researchers across a number of fields. A breakthrough in its understanding was made by Kleinberg, who introduced Rank Based Augmentation (RBA): add to each vertex, independently, an arc to a random destination selected from a carefully crafted probability distribution. Kleinberg proved that RBA makes many networks navigable, i.e., it allows greedy routing to successfully deliver messages between any two vertices in a polylogarithmic number of steps. We prove that navigability is an inherent property of many random networks, arising without coordination, or even independence assumptions.

    The evolution of interdisciplinarity in physics research

    Science, being a social enterprise, is subject to fragmentation into groups that focus on specialized areas or topics. Often new advances occur through cross-fertilization of ideas between sub-fields that otherwise have little overlap as they study dissimilar phenomena using different techniques. Thus, to explore the nature and dynamics of scientific progress one needs to consider the large-scale organization and interactions between different subject areas. Here, we study the relationships between the sub-fields of Physics using the Physics and Astronomy Classification Scheme (PACS) codes employed for self-categorization of articles published over the past 25 years (1985-2009). We observe a clear trend towards increasing interactions between the different sub-fields. The network of sub-fields also exhibits core-periphery organization, the nucleus being dominated by Condensed Matter and General Physics. However, over time Interdisciplinary Physics is steadily increasing its share in the network core, reflecting a shift in the overall trend of Physics research. Comment: Published version, 10 pages, 8 figures + Supplementary Informatio

    Learning to Infer Social Ties in Large Networks

    In online social networks, most relationships lack meaningful labels (e.g., “colleague” and “intimate friends”), simply because users do not take the time to label them. An interesting question is: can we automatically infer the type of social relationships in a large network? What are the fundamental factors that imply the type of social relationships? In this work, we formalize the problem of social relationship learning in a semi-supervised framework, and propose a Partially-labeled Pairwise Factor Graph Model (PLP-FGM) for learning to infer the type of social ties. We tested the model on three different genres of data sets: Publication, Email and Mobile. Experimental results demonstrate that the proposed PLP-FGM model can accurately infer 92.7% of advisor-advisee relationships from the coauthor network (Publication), 88.0% of manager-subordinate relationships from the email network (Email), and 83.1% of the friendships from the mobile network (Mobile). Finally, we develop a distributed learning algorithm to scale up the model to real large networks.

    A Game Theoretic Model for the Formation of Navigable Small-World Networks

    Kleinberg proposed a family of small-world networks to explain the navigability of large-scale real-world social networks. However, the underlying mechanism that drives real networks to be navigable is not yet well understood. In this paper, we present a game theoretic model for the formation of navigable small-world networks. We model the network formation as a game in which people seek both high reciprocity and long-distance relationships. We show that the navigable small-world network is a Nash Equilibrium of the game. Moreover, we prove that the navigable small-world equilibrium tolerates collusions of any size and arbitrary deviations of a large random set of nodes, while non-navigable equilibria do not tolerate small group collusions or random perturbations. Our empirical evaluation further demonstrates that the system always converges to the navigable network even when limited or no information about other players' strategies is available. Our theoretical and empirical analyses provide important new insight into the connection between distance, reciprocity and navigability in social networks.

    Risk-Averse Matchings over Uncertain Graph Databases

    A large number of applications, such as querying sensor networks and analyzing protein-protein interaction (PPI) networks, rely on mining uncertain graph and hypergraph databases. In this work we study the following problem: given an uncertain, weighted (hyper)graph, how can we efficiently find a (hyper)matching with high expected reward and low risk? This problem naturally arises in the context of several important applications, such as online dating, kidney exchanges, and team formation. We introduce a novel formulation for finding matchings with maximum expected reward and bounded risk under a general model of uncertain weighted (hyper)graphs that we introduce in this work. Our model generalizes probabilistic models used in prior work, and captures both continuous and discrete probability distributions, thus allowing it to handle privacy-related applications that inject appropriately distributed noise into (hyper)edge weights. Given that our optimization problem is NP-hard, we turn our attention to designing efficient approximation algorithms. For the case of uncertain weighted graphs, we provide a $\frac{1}{3}$-approximation algorithm, and a $\frac{1}{5}$-approximation algorithm with near optimal run time. For the case of uncertain weighted hypergraphs, we provide an $\Omega(\frac{1}{k})$-approximation algorithm, where $k$ is the rank of the hypergraph (i.e., any hyperedge includes at most $k$ nodes), that runs in almost (modulo log factors) linear time. We complement our theoretical results by testing our approximation algorithms on a wide variety of synthetic experiments, where we observe in a controlled setting interesting findings on the trade-off between reward and risk. We also provide an application of our formulation for providing recommendations of teams that are likely to collaborate and have high impact. Comment: 25 page

    Gender Detection on Social Networks using Ensemble Deep Learning

    Analyzing the ever-increasing volume of posts on social media sites such as Facebook and Twitter requires improved information processing methods for profiling authorship. Document classification is central to this task, but the performance of traditional supervised classifiers has degraded as the volume of social media has increased. This paper addresses this problem in the context of gender detection through ensemble classification that employs multi-model deep learning architectures to generate specialized understanding from different feature spaces.

    From Relational Data to Graphs: Inferring Significant Links using Generalized Hypergeometric Ensembles

    The inference of network topologies from relational data is an important problem in data analysis. Example applications include the reconstruction of social ties from data on human interactions, the inference of gene co-expression networks from DNA microarray data, and the learning of semantic relationships based on co-occurrences of words in documents. Solving these problems requires techniques to infer significant links in noisy relational data. In this short paper, we propose a new statistical modeling framework to address this challenge. It builds on generalized hypergeometric ensembles, a class of generative stochastic models that give rise to analytically tractable probability spaces of directed, multi-edge graphs. We show how this framework can be used to assess the significance of links in noisy relational data. We illustrate our method on two data sets capturing spatio-temporal proximity relations between actors in a social system. The results show that our analytical framework provides a new approach to inferring significant links from relational data, with interesting perspectives for the mining of data on social systems. Comment: 10 pages, 8 figures, accepted at SocInfo201